6 research outputs found

    Developing, Evaluating and Scaling Learning Agents in Multi-Agent Environments

    The Game Theory & Multi-Agent team at DeepMind studies several aspects of multi-agent learning, ranging from computing approximations to fundamental concepts in game theory, to simulating social dilemmas in rich spatial environments, to training 3D humanoids in difficult team-coordination tasks. A signature aim of our group is to use the resources and expertise available to us at DeepMind in deep reinforcement learning to explore multi-agent systems in complex environments, and to use these benchmarks to advance our understanding. Here, we summarise the recent work of our team and present a taxonomy that we feel highlights many important open challenges in multi-agent research.
    Comment: Published in AI Communications 202

    Managing learning interactions for collaborative robot learning

    Robotic assistants should be able to actively engage their human partner(s) to generalize knowledge about relevant tasks within their shared environment. A key challenge is that not all human partners will be proficient teachers; moreover, humans should not be expected to track a robot's knowledge over time in a dynamically changing environment, across multiple tasks. It is therefore important to enable these interactive robots to characterize their own uncertainty and to equip them with an information-gathering policy for asking appropriate questions of their human partners to resolve that uncertainty. In this way, the robot shares responsibility for guiding its own learning process and becomes a collaborator in the learning. Additionally, since the robot requires some tutelage from its partner, awareness of the constraints on the teacher's time and the cognitive resources available for the interaction can help the agent use its allotted time more wisely.

    This thesis examines the problem of enabling a robotic agent to leverage structured interaction with a human partner to acquire concepts relevant to a task it must later perform. To equip the agent with the desired concept knowledge, we first explore the paradigm of Learning from Demonstration for the acquisition of (1) training instances as examples of task-relevant concepts and (2) informative features for appropriately representing and discriminating between task-relevant concepts. Given empirical evidence that a human partner can help the agent solve the concept-learning problem, we then investigate the design of algorithms that enable the robot learner to autonomously manage interaction with its human partner, using a questioning policy to actively gather both instance and feature information.

    This thesis investigates the following hypothesis: in the context of robot learning from human demonstrations in changeable and resource-constrained environments, enabling the robot to actively elicit multiple types of information through questions, and to reason about what question to ask and when, leads to improved learning performance.
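The questioning-policy idea in the abstract above can be sketched with a standard uncertainty-sampling rule: among candidate instances, the learner asks its teacher about the one whose predicted label is least certain. The scoring function and the example instances below are illustrative assumptions, not the thesis's actual algorithm.

```python
def uncertainty(p_positive):
    """Uncertainty of a binary prediction: peaks at p = 0.5, zero at 0 or 1."""
    return 1.0 - abs(p_positive - 0.5) * 2.0

def select_question(candidates):
    """candidates: list of (instance_id, predicted_p_positive) pairs.
    Returns the id of the instance most worth asking the teacher about."""
    return max(candidates, key=lambda c: uncertainty(c[1]))[0]

# Toy candidates: the learner is confident about "cup" and "spoon",
# so it spends its limited question budget on "bowl".
candidates = [("cup", 0.95), ("bowl", 0.52), ("spoon", 0.10)]
print(select_question(candidates))
```

A budget-aware variant would stop asking once the teacher's time allotment is exhausted or no candidate's uncertainty exceeds a threshold, matching the abstract's concern with resource-constrained interaction.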

    Multimodal Real-Time Contingency Detection for HRI

    © 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
    Presented at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2014), 14-18 September 2014, Chicago, IL. DOI: 10.1109/IROS.2014.6943025
    Our goal is to develop robots that naturally engage people in social exchanges. In this paper, we focus on the problem of recognizing that a person is responsive to a robot's request for interaction. Inspired by human cognition, our approach is to treat this as a contingency detection problem. We present a simple discriminative Support Vector Machine (SVM) classifier to compare against generative methods introduced in prior work by Lee et al. [1]. We evaluate these methods in two ways. First, we train three separate SVMs with multimodal sensory input on a set of batch data collected in a controlled setting, where we obtain an average F₁ score of 0.82. Second, in an open-ended experiment with seven participants, we show that our model performs contingency detection in real time and generalizes to new people, with a best F₁ score of 0.72.
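The pattern the abstract describes, training a discriminative SVM on labeled feature vectors and reporting an F₁ score on held-out data, can be sketched from scratch. The code below is a minimal linear SVM trained with Pegasos-style stochastic subgradient descent on synthetic 2D data; the features, data, and hyperparameters are illustrative assumptions, not the paper's multimodal setup.

```python
import random

random.seed(0)

def make_data(n, mean, label):
    """Synthetic 2D feature vectors for one class, labeled +1 or -1."""
    return [([random.gauss(mean, 0.5), random.gauss(mean, 0.5)], label)
            for _ in range(n)]

# Toy stand-ins for "responsive" (+1) and "unresponsive" (-1) observations.
train = make_data(50, 2.0, 1) + make_data(50, -2.0, -1)
test = make_data(20, 2.0, 1) + make_data(20, -2.0, -1)

def train_svm(data, lam=0.01, epochs=200):
    """Pegasos: stochastic subgradient descent on the hinge-loss objective."""
    w = [0.0, 0.0]
    t = 0
    for _ in range(epochs):
        random.shuffle(data)
        for x, y in data:
            t += 1
            eta = 1.0 / (lam * t)
            margin = y * (w[0] * x[0] + w[1] * x[1])
            w = [(1 - eta * lam) * wi for wi in w]  # regularization shrink
            if margin < 1:  # point inside margin: step toward y * x
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w

def f1(w, data):
    """F1 score of the sign classifier x -> sign(w . x) on labeled data."""
    tp = fp = fn = 0
    for x, y in data:
        pred = 1 if (w[0] * x[0] + w[1] * x[1]) >= 0 else -1
        if pred == 1 and y == 1: tp += 1
        elif pred == 1 and y == -1: fp += 1
        elif pred == -1 and y == 1: fn += 1
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

w = train_svm(train)
print(f"F1 on held-out data: {f1(w, test):.2f}")
```

On this cleanly separated toy data the F₁ score is near 1.0; the paper's lower scores (0.82 batch, 0.72 live) reflect the much harder real-world sensing problem. A practical system would use a kernel SVM over real multimodal features rather than this linear toy.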